AITopics | stochastic block model

Collaborating Authors

stochastic block model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Bridging Maximum Likelihood and Optimal Transport for Efficient Inference and Model Selection in Stochastic Block Models

Queric, Simon, Vincent-Cuaz, Cédric, Bouveyron, Charles, Corneli, Marco

arXiv.org Machine LearningMay-28-2026

We study inference in stochastic block models (SBMs) through the lens of optimal transport (OT). We first establish that maximum likelihood variational inference (MLVI) can be interpreted as a semi-relaxed Gromov-Wasserstein (srGW) projection with entropic regularization. While this formulation yields accurate clustering, the entropic regularization prevents transport plans to be sparse, hindering intrinsic model selection. Consequently, we investigate unregularized srGW estimators, and prove that they consistently recover both the SBM connectivity matrix and latent cluster assignments in the asymptotic regime. However, this asymptotic property does not translate into reliable model selection in finite samples, and calls for additional mechanisms to promote sparsity in the inferred cluster proportions. We empirically show that such a regularized formulation yields estimators that simultaneously recover model parameters and select the number of clusters in a single optimization problem, thereby avoiding costly grid search or heuristic model selection procedures.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2605.28488

Country: Europe (0.93)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Add feedback

Node-private community estimation in stochastic block models: Tractable algorithms and lower bounds

Marchis, Laurentiu, D'souza, Ethan, Flídr, Tomáš, Loh, Po-Ling

arXiv.org Machine LearningMay-18-2026

We study the classical problem of community recovery in stochastic block models with a fixed number of communities, with a twist: We seek algorithms that are stable with respect to node-wise changes in the graph structure, formally defined as a differential privacy constraint. The algorithms we develop are based on spectral clustering, where we introduce privacy to the community recovery pipeline in the form of directly privatizing the adjacency matrix; private PCA; private convex optimization; private low-rank matrix estimation; and private approximate subspace estimation. Straightforward applications of existing private algorithms lead to a rapid increase in the privacy parameter $ε$ in order to ensure consistent estimation under node differential privacy, in contrast with the simpler setting of edge privacy. To alleviate these issues, we develop novel algorithms based on (1) sampling from an exponential mechanism with a Lipschitz extension and (2) a general framework for constructing smooth projections from the space of undirected graphs to the space of bounded-degree graphs, which can then be combined with various edge-private algorithms. Importantly, the methods we develop are all computable in polynomial-time as a function of the number of nodes in the graph. We also develop novel lower bounds on the growth rate of $ε$ required in order to achieve consistent community estimation under node privacy. On a technical note, our paper highlights the complications that arise when analyzing private algorithms under the non-standard scaling $ε\rightarrow \infty$ and proposes some solutions. We also provide a novel application of the HGR maximal correlation from information theory in the context of accuracy amplification in PAC learning, which may be of independent interest.

artificial intelligence, machine learning, survey article, (19 more...)

arXiv.org Machine Learning

2605.15943

Genre:

Overview (0.65)
Research Report (0.63)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Security & Privacy (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Add feedback

Streaming Algorithms and Lower Bounds for Estimating Correlation Clustering Cost

Neural Information Processing SystemsApr-30-2026, 05:25:42 GMT

Correlation clustering is a fundamental optimization problem at the intersection of machine learning and theoretical computer science. Motivated by applications to big data processing, recent years have witnessed a flurry of results on this problem in the streaming model. In this model, the algorithm needs to process the input n-vertex graph by making one or few passes over the stream of its edges and using a limited memory, much smaller than the input size. All previous work on streaming correlation clustering has focused on semistreaming algorithms with Ω(n) memory, whereas in this work, we study streaming algorithms with much smaller memory requirements of only polylog(n) bits. This stringent memory requirement is in the same spirit of classical streaming algorithms that instead of recovering a full solution to the problem--which can be prohibitively large with such small memory as is the case in our problem--, aimed to learn certain statistical properties of their inputs.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > California (0.67)

Industry: Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Add feedback

Private estimation algorithms for stochastic block models and mixture models

Neural Information Processing SystemsApr-29-2026, 22:42:13 GMT

We introduce general tools for designing efficient private estimation algorithms, in the high-dimensional settings, whose statistical guarantees almost match those of the best known non-private algorithms. To illustrate our techniques, we consider two problems: recovery of stochastic block models and learning mixtures of spherical Gaussians. For the former, we present the first efficient (ε,δ)-differentially private algorithms for both weak recovery and exact recovery. Previously known algorithms achieving comparable guarantees required quasi-polynomial time. We complement these results with an information-theoretic lower bound that highlights how the guarantees of our algorithms are almost tight. For the latter, we design an (ε,δ)-differentially private algorithm that recovers the centers of the k-mixture when the minimum separation is at least O(k1/t t). For all choices of t, this algorithm requires sample complexity n kO(1)dO(t) and time complexity (nd)O(t). Prior work required either an additional additive Ω( logn) term in the minimum separation or an explicit upper bound on the Euclidean norm of the centers.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre: Workflow (0.69)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.92)

Add feedback

Spectral embedding for dynamic networks with stability guarantees

Neural Information Processing SystemsApr-25-2026, 23:03:29 GMT

We consider the problem of embedding a dynamic network, to obtain time-evolving vector representations of each node, which can then be used to describe changes in behaviour of individual nodes, communities, or the entire graph. Given this open-ended remit, we argue that two types of stability in the spatio-temporal positioning of nodes are desirable: to assign the same position, up to noise, to nodes behaving similarly at a given time (cross-sectional stability) and a constant position, up to noise, to a single node behaving similarly across different times (longitudinal stability). Similarity in behaviour is defined formally using notions of exchangeability under a dynamic latent position network model. By showing how this model can be recast as a multilayer random dot product graph, we demonstrate that unfolded adjacency spectral embedding satisfies both stability conditions. We also show how two alternative methods, omnibus and independent spectral embedding, alternately lack one or the other form of stability.

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Algorithmic Contiguity from Low-Degree Heuristic II: Predicting Detection-Recovery Gaps

Li, Zhangsong

arXiv.org Machine LearningApr-21-2026

The low-degree polynomial framework has emerged as a powerful tool for providing evidence of statistical-computational gaps in high-dimensional inference. For detection problems, the standard approach bounds the low-degree advantage through an explicit orthonormal basis. However, this method does not extend naturally to estimation tasks, and thus fails to capture the \emph{detection-recovery gap phenomenon} that arises in many high-dimensional problems. Although several important advances have been made to overcome this limitation \cite{SW22, SW25, CGGV25+}, the existing approaches often rely on delicate, model-specific combinatorial arguments. In this work, we develop a general approach for obtaining \emph{conditional computational lower bounds} for recovery problems from mild bounds on low-degree testing advantage. Our method combines the notion of algorithmic contiguity in \cite{Li25} with a cross-validation reduction in \cite{DHSS25} that converts successful recovery into a hypothesis test with lopsided success probabilities. In contrast to prior unconditional lower bounds, our argument is conceptually simple, flexible, and largely model-independent. We apply this framework to several canonical inference problems, including planted submatrix, planted dense subgraph, stochastic block model, multi-frequency angular synchronization, orthogonal group synchronization, and multi-layer stochastic block model. In the first three settings, our method recovers existing low-degree lower bounds for recovery in \cite{SW22, SW25} via a substantially simpler argument. In the latter three, it gives new evidence for conjectured computational thresholds including the persistence of detection-recovery gaps. Together, these results suggest that mild control of low-degree advantage is often sufficient to explain computational barriers for recovery in high-dimensional statistical models.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2604.1741

Country:

North America > United States (0.28)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Optimal Cluster Recovery in the Labeled Stochastic Block Model

Se-Young Yun, Alexandre Proutiere

Neural Information Processing SystemsMar-23-2026, 20:54:05 GMT

We consider the problem of community detection or clustering in the labeled Stochastic Block Model (LSBM) with a finite number K of clusters of sizes linearly growing with the global population of items n. Every pair of items is labeled independently at random, and label ` appears with probability p(i,j,`) between two items in clusters indexed by iand j, respectively. The objective is to reconstruct the clusters from the observation of these random labels. Clustering under the SBM and their extensions has attracted much attention recently. Most existing work aimed at characterizing the set of parameters such that it is possible to infer clusters either positively correlated with the true clusters, or with a vanishing proportion of misclassified items, or exactly matching the true clusters. We find the set of parameters such that there exists a clustering algorithm with at most s misclassified items in average under the general LSBM and for any s = o(n), which solves one open problem raised in [2]. We further develop an algorithm, based on simple spectral methods, that achieves this fundamental performance limit within O(npolylog(n)) computations and without the a-priori knowledge of the model parameters.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: